Contribution of Spatio-temporal Intensity Variation to Bottom-Up Saliency
Abstract
We investigate the contribution of local spatio-temporal variation of image intensity to saliency. To measure different types of variation, we use the geometrical invariants of the structure tensor. With a video represented in spatial axes x and y and temporal axis t, the n-dimensional structure tensor can be evaluated for different combinations of axes (2D and 3D) and also for the (degenerate) case of only one axis. The resulting features are evaluated on several spatio-temporal scales in terms of how well they can predict eye movements on complex videos. We find that a 3D structure tensor is optimal: the most predictive regions of a movie are those where intensity changes along all spatial and temporal directions. Among two-dimensional variations, the axis pair yt, which is sensitive to horizontal translation, outperforms xy and xt by a large margin, and is even superior in prediction to two baseline models of bottom-up saliency.
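As a rough illustration of the kind of feature the abstract describes, the sketch below builds a structure tensor over a chosen subset of video axes and returns its rotation-invariant quantities. It is not the authors' implementation. It assumes NumPy, a grayscale video stored as an array shaped (t, y, x), finite-difference gradients, and a simple box window for local averaging. The function names `box_smooth` and `structure_tensor_invariants` are hypothetical, and the invariants are computed as elementary symmetric polynomials of the eigenvalues, which may differ in normalization from the definitions used in the paper.

```python
import numpy as np


def box_smooth(a, radius=1):
    """Average over a (2*radius+1) window along every axis (edge-replicated)."""
    out = a.astype(float)
    k = 2 * radius + 1
    for axis in range(a.ndim):
        pad = [(radius, radius) if i == axis else (0, 0) for i in range(a.ndim)]
        padded = np.pad(out, pad, mode="edge")
        n = a.shape[axis]
        # Sum k shifted copies of the padded array, then divide by k.
        out = sum(np.take(padded, range(s, s + n), axis=axis) for s in range(k)) / k
    return out


def structure_tensor_invariants(video, axes=(0, 1, 2), radius=1):
    """Return per-voxel invariants of the structure tensor built on `axes`.

    video : 3D array indexed (t, y, x); axes 0, 1, 2 mean t, y, x.
    For n selected axes the result is a list of n arrays: the elementary
    symmetric polynomials e1..en of the tensor's eigenvalues
    (e1 = trace, en = determinant).
    """
    grads = np.gradient(video.astype(float))  # one gradient array per axis
    g = [grads[a] for a in axes]
    n = len(g)

    # Locally averaged outer product of the gradient with itself.
    J = np.empty(video.shape + (n, n))
    for i in range(n):
        for j in range(i, n):
            J[..., i, j] = J[..., j, i] = box_smooth(g[i] * g[j], radius)

    lam = np.linalg.eigvalsh(J)  # eigenvalues, shape video.shape + (n,)

    # Accumulate elementary symmetric polynomials one eigenvalue at a time.
    e = [np.ones(video.shape)] + [np.zeros(video.shape) for _ in range(n)]
    for k in range(n):
        for m in range(k + 1, 0, -1):
            e[m] = e[m] + lam[..., k] * e[m - 1]
    return e[1:]


# Usage: a static ramp along x varies in one direction only.
ramp = np.tile(np.arange(8.0), (5, 6, 1))          # shape (t=5, y=6, x=8)
H, S, K = structure_tensor_invariants(ramp)        # all three axes
H_yt, K_yt = structure_tensor_invariants(ramp, axes=(0, 1))  # t and y only
```

In this sketch, a region counts as varying along all selected directions only when the last invariant (the determinant) is clearly above zero, so the static ramp yields a nonzero trace but a vanishing determinant. Restricting `axes` to a pair or a single axis gives the 2D and degenerate 1D cases mentioned in the abstract.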
Similar resources
No-Reference Video quality assessment of H.264 video streams based on semantic saliency maps
The paper contributes to no-reference video quality assessment of HD video broadcast over IP networks and DVB. In this work we have enhanced our bottom-up spatio-temporal saliency map model by considering the semantics of the visual scene. We propose a new saliency map model based on face detection, which we call the semantic saliency map. A new fusion method has been proposed to merge the botto...
Spatio-Temporal Variation of Suspended Sediment Concentration at Downstream of a Sand Mine
Population growth has increased human demand for natural resources such as sand and gravel mines. Direct removal of sand from the river bed increases suspended sediment concentrations downstream of the harvested area and creates other problems, viz. filling of reservoirs, changes in the hydraulic characteristics of the channel, and environmental damage. However, the range of temporal and ...
Automatic video summarization driven by a spatio-temporal attention model
According to the literature, automatic video summarization techniques can be classified into two categories according to the nature of their output: “video skims”, which are generated using portions of the original video, and “key-frame sets”, which consist of images selected from the original video that have significant semantic content. The difference between these two categories is reduced when we cons...
Quantifying the Contribution of Low-Level Saliency to Human Eye Movements in Dynamic Scenes
We investigated the contribution of low-level saliency to human eye movements in complex dynamic scenes. Eye movements were recorded while naive observers viewed a heterogeneous collection of 50 video clips (46,489 frames; 4-6 subjects per clip), yielding 11,916 saccades of amplitude ≥ 2°. A model of bottom-up visual attention computed instantaneous saliency at the instant each saccade started ...
Compressed-Sampling-Based Image Saliency Detection in the Wavelet Domain
When watching natural scenes, an overwhelming amount of information is delivered to the Human Visual System (HVS). The optic nerve is estimated to receive around 10^8 bits of information per second. This large amount of information cannot be processed immediately by our neural system. The visual attention mechanism enables the HVS to spend neural resources efficiently, only on the selected parts of the...